MGMT 675
AI-Assisted Financial Analysis

Data Handling

Data Handling

Topics

  • Merge
  • Filter
  • Sort
  • Aggregate by group
  • Transform

Datasets

  • metrics.xlsx and tickers.xlsx.
  • Online data from various sources.

Merge metrics and tickers

  • Ask Julius to merge the datasets on the ticker column
  • Ask Julius the following.
    • How many rows are there?
    • What are the column names?
    • What are the unique values in the category column?
    • What are the unique values in the sector column?
    • Show the head of the data frame.

Filter

  • Ask Julius to filter on the category column to “Domestic Common Stock” and “Domestic Common Stock Primary Class.” Ask Julius to call this data frame common_stock.
  • Ask Julius to create a copy of the common_stock data frame that contains only rows for which pe \(>0\).
  • Ask Julius to create a copy of the common_stock data frame that contains only rows for which marketcap is above the median marketcap.

Sort

  • Ask Julius to sort on marketcap in descending order and to show the head of the data frame.

Aggregate by group

  • Ask Julius to describe marketcap.
  • Ask Julius to compute the mean marketcap by sector.
  • Ask Julius to compute the number of firms by sector.
  • Ask Julius to compute the total marketcap by sector.

  • Ask Julius to compute the mean pe grouped by (sector, scalemarketcap) and to display the results as a two-dimensional table.
  • Ask Julius to recreate the table using only rows for which pe \(>0\).
  • Ask Julius to compute the percent of firms for which pe \(<0\) by sector.

Transform

  • Ask Julius to create a new variable equal to the rank of marketcap in descending order.
  • Ask Julius to create a new variable that is 1 if pe \(>0\) and 0 otherwise (a dummy variable).
  • Ask Julius to create a new variable equal to the excess of pb over the median sector pb.

Online data

Yahoo Finance

  • Daily open, high, low, close, adjusted close, volume
  • Income statement, balance sheet, and statement of cash flows for past 5 years
  • Current market option data (bid, ask, last price, open interest, implied volatility, …)
  • Can get with yfinance library

Yahoo’s Adjusted Closing Prices

  • Yahoo’s adjusted closing prices are adjusted for splits and dividends.
  • The percent change in the adjusted closing price is the daily close-to-close return including dividends.

Caveat

On ex-dividend days, the percent change in the adjusted closing price is \[\frac{P_{t}}{P_{t-1}-D_t} - 1\] rather than what we might prefer: \[\frac{P_{t} + D_t}{P_{t-1}} - 1\] but this is a minor issue (small difference 4 days a year).

Monthly, Annual, … Returns

If we want returns at a different frequency, for example annual returns, then we can either

  • compound the daily returns, or
  • downsample the adjusted closing prices to annual data and compute the percent change of the downsampled data.

Example

  • Ask Julius to use yfinance to get adjusted closing prices for SPY for the longest history available.
  • Ask Julius to downsample the prices to end-of-month.
  • Ask Julius to compute monthly returns as the percent change in the downsampled prices.

Federal Reserve Economic Data

  • Ask Julius to use the pandas-datareader to get the history of crude oil prices from FRED.
  • Ask Julius to get the history of inflation rates from FRED.

Ken French’s Data Library

  • Ask Julius to get the Fama-French factors from Ken French’s data library.
  • Ask Julius to list the datasets on Ken French’s data library.
  • Ask Julius to get the 48 industry returns from Ken French’s data library.

Scraping

  • Ask Julius to find the constituents of the S&P 100.
  • When Julius provides a link, ask Julius to read the table at the link.